Torquemada turns a corner today, so it's appropriate to start a new file.
Here are some swatches from the crap that was dumped on me yesterday:
@head:GRAPHICAL USER INTERFACES
@sub:MARKET OVERVIEW
@body:This market update reports on the graphical user interface (GUI ) market including all major GUIs, windowing interfaces and vendor strategies covered. Namely, the Unix variations: Sun/AT&T’s OpenLook, NeXT’s NeXTstep, and Open Software Foundation’s OSF/Motif; and Microsoft’s Windows 3.0, Apple’s Macintosh interface and IBM’s Presentation Manager (PM).
Saugus, MA 01906 Manhasset, NY 11030
Route 1 Saugus 1400 Northern Blvd.
Framingham, MA 01701 Huntington, NY 11746
Lechmere Mall, Cochituate Rd. & Whittier St. 350 Route 110
Worcester, MA 01608 Fresh Meadows, NY 11365
200 Front Street 187-04 Horace Harding Expressway
Burlington, MA 01803 Scarsdale, NY 10583
Route 3A & Winn St. Midway Shopping Center, 935 Central Park Ave.
Danvers, MA 01925 Copiague, NY 11726
Liberty Tree Mall, Endicott St. 1101 Sunrise Highway
Don't those ALL CAPS heads look nice? The copy was consistent enough that I was able to blow in the tags easily. But what I wanted was an [upst/XXX]. I could get the prefix, but not the suffix. This one is not on deadline, so I'll wait to do this with the Fix Heads set included in this archive.
And how about that nice job of botching up what is really a very simple task, making a running list of store addresses? This one I had to do manually, one chop, one drop, one line at a time. Bah! This is fixable now with the two sets enclosed, Preserve Left Column and Preserve Right Column. These sets should only be run on isolated swatches of text, since they will happily toast, for instance, this file as a whole.
I won't even _show_ the table with the negative sums shown with minus signs. The excessively complicated OLD Minus to In-The-Hole set is how I solved the problem yesterday, and it failed in about 10% of the cases. The NEW Minus to In-The-Hole set works fine, in fine and terse fashion. Follow along by opening that set, because I want to stress a couple of points. First, I am forcing to absolute uniqueness with the ^#. It's probably redundant, but ^* will match _ANYTHING_ until it finds its terminator; better safe than sorry. Second, each of the three strings is run twice because Torquemada replaces the _whole_ search-for text with the _whole_ replace-with text. The ^_ that successfully terminates the first find is _also_ the ^t that begins the next. If you are running searches that need to see the same one character more than once, then you have to run with additional searches; variations on ^tBLAH^t are the most common form of this.
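The run-it-twice business is not peculiar to Torquemada; any search engine that consumes the whole match behaves the same way. Here is a rough sketch in Python, with its `re` module standing in for the Inquisitor. The sample row, the function name and the choice of parentheses for the replacement are all invented for illustration; none of it is taken from the enclosed sets.

```python
import re

def wrap_minus_fields(text):
    """One pass: rewrite a tab-delimited -N field as (N)."""
    # The pattern consumes BOTH tabs, so the tab that closes one match
    # is already used up and cannot open the next match.
    return re.sub(r"\t-(\d+)\t", lambda m: "\t(" + m.group(1) + ")\t", text)

row = "Total\t-5\t-7\t-9\tend"

once = wrap_minus_fields(row)    # the middle field is skipped on this pass
twice = wrap_minus_fields(once)  # the second pass picks it up
```

After one pass the -5 and -9 fields are converted but the -7 between them is not, because its leading tab was swallowed by the match before it. The second pass cleans up the stragglers.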
Now: What is happening?
Two new wildcards are added: ^* and ^_.
The simple one first: ^_ will match any _space_ character, just as ^# matches any number and ^! matches any punctuation. This is useful for uniqueness, but it's also preternaturally useful for cutting down on the number of searches needed to do a job, which should make headlines in the Southern Hemisphere. The space characters matched are tab, return, space and ASCII 202, which is option-space from the keyboard. Note that this will _not_ match <\f> or any other text _representation_ of a space character. It also will not match form feed or vertical tab, since those are being isolated at read time. The NEW Minus to In-The-Hole set shows very effective use of this new wildcard.
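For the curious, the membership test described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated, not Torquemada's code; the names are made up, and it leans on Python's `mac_roman` codec to turn byte 202 into its Unicode equivalent.

```python
# Assumed stand-in for ^_: the four characters listed above.
# Byte 202 in the Mac character set is option-space (the non-breaking
# space); Python's "mac_roman" codec decodes it to U+00A0.
OPTION_SPACE = bytes([202]).decode("mac_roman")
SPACE_WILDCARD = {"\t", "\r", " ", OPTION_SPACE}

def matches_space_wildcard(ch):
    """True if ch is a single character that ^_ would accept."""
    return ch in SPACE_WILDCARD

def count_space_hits(text):
    """Count the characters in text that one ^_ search could match."""
    return sum(1 for ch in text if matches_space_wildcard(ch))
```

One test against the set does the work of four separate literal searches, which is the whole economy of a typed wildcard. Form feed, vertical tab and a spelled-out `<\f>` all fail the test, as described.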
And then, the _wild_ one: ^* will match and _store_ any run of characters _until_ the next character in the search string is found. If there is no character after ^*, everything to the end of the _buffer_ is stored. Since you have no way of knowing where the buffer ends, you are well advised to follow ^* with _something_. OTOH, if you want to make a zero-length file, this will do it:
^*
Search for anything, replace with nothing. Has a certain Congressional elegance to it...
FYI, QED: You can throw things away, as is demonstrated in Preserve Right Column and its sinister twin.
And, as always with the Inquisitor, you can resequence, as is shown in Fix Heads. Note that ^* matches _up to but not including_ the terminator character. You don't have to use either ^* or the terminating character on the replace side, and, if you do, you don't have to show them in the same sequence. The whole point is to make it possible to insert text after a known starting point and before a known ending point, so ^* behaves exactly like the asterisk in the Unix command line (and unlike the asterisk in the DOS command line, which has inflexible and rigidly hard-coded termination).
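If you think in regular expressions, the nearest analog to ^* followed by a terminator is a non-greedy capture group followed by that terminator. The Python sketch below assumes a search along the lines of @head:^*^p; the pattern, the function name and the "[head]" replacement text are hypothetical and are not what Fix Heads actually contains.

```python
import re

# (.*?) stores a run of any characters and stops at the first return.
# The return itself sits outside the group, so it is NOT part of the
# stored text, just as ^* stops short of its terminator.
HEAD = re.compile(r"@head:(.*?)\r", re.DOTALL)

def stored_runs(text):
    """Return what the stand-in for ^* would store for each head."""
    return HEAD.findall(text)

def move_tag_to_end(text):
    """Resequence: drop the leading tag, keep the run, add text after it."""
    # The terminator has to be put back by hand on the replace side.
    return HEAD.sub(lambda m: m.group(1) + " [head]\r", text)

sample = "@head:MARKET OVERVIEW\rbody text\r"
```

The replace side is free to use the stored run, ignore it, or put new literal text on either side of it, which is the insertion trick described above.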
A ^* can be terminated with:
Literal text (not aliased, not wild, just text)
The ^t, ^p and ^^ aliases (these are actually literal text at search time)
The ^#, ^! and ^_ wildcards (because these are 'typed' we can trust them)
A ^* _cannot_ be terminated with:
A ^0, ^1, ... ^9 wildcard (since these match _ANY_ character, the ^*'d text would be zero bytes in length, which kind of defeats the purpose)
A ^* (for the same reason, among others)
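The two lists above boil down to a small lookup. Here is a hedged Python sketch of a checker that flags a ^* followed by something illegal. It assumes every caret code is exactly two characters long and knows only the codes named in the lists; the function names are invented, and nothing here reflects how Torquemada parses its own sets.

```python
# Codes that may NOT follow a ^*: the ^0 ... ^9 wildcards and another ^*.
ILLEGAL_AFTER_STAR = set("0123456789") | {"*"}

def tokenize(search):
    """Split a search string into two-character caret codes and literals."""
    tokens = []
    i = 0
    while i < len(search):
        if search[i] == "^" and i + 1 < len(search):
            tokens.append(search[i:i + 2])  # a caret code such as ^* or ^p
            i += 2
        else:
            tokens.append(search[i])        # a literal character
            i += 1
    return tokens

def bad_star_terminators(search):
    """Return the tokens that illegally follow a ^* in the search string."""
    tokens = tokenize(search)
    bad = []
    for current, following in zip(tokens, tokens[1:]):
        is_code = len(following) == 2 and following[0] == "^"
        if current == "^*" and is_code and following[1] in ILLEGAL_AFTER_STAR:
            bad.append(following)
    return bad
```

For example, `bad_star_terminators("Blah^*^1")` reports the ^1, while `bad_star_terminators("@head:^*^p")` comes back empty.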
Of course, I don't just make jokes about sinlessness, I strive to design sinless software. So if you screw up and use one of these illegal constructs, you won't crash. Your results may not match your intentions, but they will match your request.
That's it. This is way awesome cool, and if you don't take advantage of it, you're likely to be replaced by a wilder text massager (grin).
And _this_, I swear, is the end for a while. Sure, sure, sure....
Very Best,
Greg Swann
4/9/92
Further notice:
It turns out that I was worried about nothing. As your fee for waiting these 24 hours, I've added two more typed wildcards. These are:
^+ (plus) which will match any one UPPERCASE character, and
^- (minus) which will match any one lowercase character.
These work like the other typed wildcards (number, punctuation and space): only the characters that meet the test are considered to match. In that way, they are useful for uniqueness. In conjunction with ^*, they are also useful for conditional processing. This is demonstrated in Fix (ALL CAPS) Heads; this works the same as Fix Heads except that _only_ heads that are typed in ALL CAPS will be replaced; heads already in upper and lower case are left alone.
'UPPERCASE' and 'lowercase' _includes_ the appropriate eight-bit accented characters. And as a reminder, 'punctuation' also includes the eight-bit Macpunct. God is in the details...
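The conditional idea can be sketched in Python as follows. This is a guess at the spirit of Fix (ALL CAPS) Heads, not a transcription of it: the @head: tag comes from the swatch at the top of this file, but the function names are invented and title-casing is just a stand-in for whatever the real set writes out. Python's `isupper` and `islower` cover accented letters too, though its idea of a letter is broader than the Mac's eight-bit set.

```python
import re

HEAD = re.compile(r"@head:(.*?)\r", re.DOTALL)

def matches_upper_wildcard(ch):
    """Stand-in for ^+: one uppercase character, accented ones included."""
    return len(ch) == 1 and ch.isupper()

def matches_lower_wildcard(ch):
    """Stand-in for ^-: one lowercase character, accented ones included."""
    return len(ch) == 1 and ch.islower()

def fix_all_caps_heads(text):
    """Rewrite only heads that contain no lowercase letters."""
    def replace(match):
        head = match.group(1)
        if any(matches_lower_wildcard(c) for c in head):
            return match.group(0)            # already mixed case: leave alone
        return "@head:" + head.title() + "\r"
    return HEAD.sub(replace, text)

sample = "@head:MARKET OVERVIEW\r@head:Vendor Strategies\r"
```

Only the first head in `sample` is changed; the second already has lowercase letters in it and passes through untouched.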
Persons who live in other days are, of course, referred to Torquemada Prefs. I just can't bloody help myself!
Greg Swann
4/10/92
Further, Further Notice:
One additional typed wildcard is added: ^± will match any one _alphabetical_ character, upper or lower case. This was suggested by Shane Stanley and is useful for uniqueness and for terminating ^* strings (e.g., Blah^*^± can mean soak up and store characters that are _not_ alphabetical).
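A minimal Python sketch of what Blah^*^± amounts to: find the literal, then store everything up to (but not including) the first alphabetical character. The function name is invented, and returning None when no letter ever turns up is an assumption made for the sketch; the notes above do not say what Torquemada does in that case.

```python
def soak_non_alpha(text, prefix="Blah"):
    """Return the run between prefix and the next alphabetical character."""
    start = text.find(prefix)
    if start == -1:
        return None                  # the literal part never matched
    begin = start + len(prefix)
    end = begin
    # Advance until an alphabetical character (the stand-in for ^±) appears.
    while end < len(text) and not text[end].isalpha():
        end += 1
    if end == len(text):
        return None                  # assumption: no terminator, no match
    return text[begin:end]           # the terminator itself is not stored
```

So `soak_non_alpha("Blah 1,234.56 dollars")` stores the digits, punctuation and spaces and stops short of the "d".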
By popular demand, version numbers have been added. I'm not real crazy about this, since I hate ResEdit, but, as this is the sixth publicly released Torquemada, it makes a certain kind of sense.